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Abstract. In the recent years, Riemannian shape analysis of curves and 
surfaces has found several applications in medical image analysis. In this paper 
we present a numerical discretization of second order Sobolev metrics on the 
space of regular curves in Euclidean space. This class of metrics has several 
desirable mathematical properties. We propose numerical solutions for the 
initial and boundary value problems of finding geodesics. These two methods 
are combined in a Riemannian gradient-based optimization scheme to compute 
the Karcher mean. We apply this to a study of the shape variation in HeLa cell 
nuclei and cycles of cardiac deformations, by computing means and principal 
modes of variations. 


1. Introduction 


The comparison and analysis of geometric shapes plays an important role in 
medical imaging 27 , biology as well as many other fields . Spaces of geometric 
shapes are inherently nonlinear. To make standard methods of statistical analysis 
applicable, one can linearize the space locally around each shape. This can be 
achieved by introducing a Riemannian structure, which is able to describe jointly 
the global nonlinearity of the space as well as its local linearity. 

Over the past decade, Riemannian shape analysis has become an active area 
of research in pure and applied mathematics. Driven by applications, a variety of 
spaces, equipped with different Riemannian metrics, have been used to represent 
geometrical shapes and their attributes. Ideally, the particular choice of metric 
should be dictated by the data at hand rather than by mathematical or numerical 
convenience. This leads to the task of developing efficient numerical methods for 
the statistical analysis of shapes for general and flexible classes of metrics. 

The topic of this paper are second order Sobolev metrics on the space of regular, 
planar curves. This space and its quotients by translations, rotations, scalings, and 
reparametrizations are important and widely used spaces in shape analysis. Second 
order Sobolev metrics are mathematically well-behaved: the geodesic equation 
is globally well-posed, any two curves in the same connected component can be 
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connected by a minimizing geodesic, the metric completion consists of all H^- 
immersions, and the metric extends to a strong Riemannian metric on the metric 
completion [^[^. 

We provide the first numerical implementation of the initial and boundary value 
problems for geodesics with respect to these metrics and apply them to medical dataj^ 
Our implementation allows us to compute Karcher means, principal components, 
and clusters of curves in reasonable time. The parameters in the metric can be 
chosen freely because we do not rely on transforms which exist only for special 
choices of parameters 23 26 . We are also able to factor out the action of the 


finite-dimensional translation and rotation groups. 

We illustrate the behaviour of the metrics in two medical applications. In Sect. |4.1 
we use images of HeLa cell nuclei from and compute the mean shape of the 
nucleus and the principal modes of shape variation; in Sect. [4^ we use traces of 


cardiac image sequences from 12 24 and compare the mean shape of patients with 


Tetralogy of Fallot with the mean shape of a control group. 

The paper is structured as follows: Sect, [^contains the definition of second order 
Sobolev metrics and some relevant mathematical background. Section describes 
our discretization of the geodesic equation and the Riemannian energy functional. 
Finally, in Sect. we apply the metrics to medical imaging data. 


2. Mathematical Background 


In this article we center our attention on the space of smooth, regular curves 
with values in 

(1) Imm(S'\K‘^) = {cGC“(S'\K'^):V6le5\ce(6l)7^0} , 

where Imm stands for immersion. As an open subset of the Frechet space 

it is a Frechet manifold. Its tangent space, Imm(S'^,K"^), at any curve c is the 

vector space K^^). 

We denote the Euclidean inner product on by (•,•). Moreover, for any fixed 
curve c, we denote differentiation and integration with respect to arc length by 
Ds = and ds = \cB\d9 respectively. 

Definition 1. A second order Sobolev metric on Imm(S'^,M‘^) is given by 

(2) Gc{h,k)= ( ao{h,k) + ai{Dsh,Dsk) + a 2 {D'^h,Dlk) ds , 

Js^ 

where oq, 02 > 0, oi > 0, and h,k G TcImm(S'^,IR‘^) are tangent vectors. 

Remark 1 (Invariance of the metric). The metric G is invariant with respect to 
translations, rotations, and reparametrizations by diffeomorphisms of but not 
with respect to scalings. It can be made scale-invariant by introducing weights that 
depend on the length £c of the curve c. A scale-invariant metric is given by 

(3) G,ih,k)= [ ^{h,k) + ^{Dsh,D,k)+a2e,{D%Dlk)ds . 

Depending on the application, it can be desirable to factor out some of these isometry 
groups. For example, we consider the HeLa cell nuclei in Sect. |4.l| as curves modulo 
translations and rotations. For the traces of cardiac images in Sect. |4.2] however, 

^Implementations of second order Sobolev metrics can be found in [l5| and j^, but none of the 
previous implementations seem to have been used in applications. 
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Figure 1. Geodesics between a cat and a cow. The metric param¬ 
eter 02 in the first row is increased by a factor 10 in the second and 
a factor 100 in the third row. 



Figure 2. Karcher means (bold) of three rotated ellipses (dashed). 
The parameter 02 of the metric used in the first figure is increased 
by a factor 10 in the second and by a factor 100 in the third figure. 


both the absolute position of the curve and its orientation have intrinsic meaning 
and should therefore not be factored out. 

Remark 2 (Choosing the constants). The constants ao,ai and 02 in the definition 
of the metric determine the relative weight of the and R^-parts. Their 

choice is a non-trivial and important task, and it should, in each application, be 
informed by the data at hand. The influence of different parameter values on 
geodesics and Karcher means can be seen in Fig. [l]and Fig. 

Remark 3 (Generalizations to higher-dimensions). Sobolev metrics have natural 
generalizations to manifold-valued curves, embedded surfaces, and more generally 
spaces of immersions of a compact manifold M into a Riemannian manifold N] 
see EE for details and for a general overview. 

Deformations of curves can be seen as smooth paths c: [0,1] —)■ Imm(S'^,K'^). 
Their velocity is Ct, the subscript t denoting differentiation. 

Definition 2 (Geodesic distance). The length of a path of curves c is 
(4) L{c) = J ^Gc(t){ctit),ct{t))dt . 

The distance between two curves in Imm(S'^,M^^) (with respect to the metric G) is 
the infimum of the lengths of all paths connecting these curves. 

dist(co,ci)= inf L{c). 

c(0)=co,c(l)=ci 
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Geodesics are locally distance-minimizing paths. 


Geodesics can be described by a partial differential equation, called the geodesic 
equation. It is the first order condition for minima of the energy 


( 5 ) 


1 

E{c) = - j Gc(t)(ci(t),ct(t)) dt 


Recently, some local and global existence results for geodesics of Sobolev metrics 
were shown in 13 . Since they provide the theoretical underpinnings for the 


numerical methods presented in this paper, we summarize them here. 


Theorem 1 (Geodesic equation). Let 00,02 > 0 and oi > 0. The geodesic equation 
of the metric G, written in terms of the momentum p = \c'\ {aQCt — aiD'^ct + a 2 D'^ct), 
is given by 

dtp = - Y\(^e\Dsi{ct,Ct)Dsc) + Y\(^s\Dsi{DsCt, DsCt)Dsc) 

( 6 ) - ^\ce\D,i{D^,ct,DsCt)Dsc) + ^\cs\Ds{{D^,Cu D^,Ct)Dsc). 

Given any initial condition {cq,Uq) € nmm(5'^, the solution of the geodesic 
equation exists for all time. 

If, however, oo,oi > 0 and 02 = 0, then G is a first order Sobolev metric. Its 
geodesic equation is locally, but not globally, well-posed. 


Remark 4 (Comparison to elastic metrics). Closely related are elastic metrics 14 , 
which in the planar case are given by 


( 7 ) 


Gc{h,k)= / a^{Dsh,n){Dsk,n)-\-b‘^{Dsh,v){Dsk,v)ds . 


Here a,b are constants and v,n denote the unit tangent and n orm al vectors t o c . 
Two special cases deserve to be highlighted: for o = 1, 6 = | 


23 


and a = b 


26 


there exist nonlinear transforms, the square root velocity transform and the basic 
mapping, that greatly simplify numerics. Both of these metrics have been applied 
to a variety of problems in shape analysis. 

We note that the elastic metric with a = b corresponds to a first order Sobolev 
metric as in Def. with oq = 02 = 0 and ai = of = b^. As it has no L^-part, it is a 
Riemannian metric only on the space of curves modulo translations. 


3. Discretization 

We propose to represent paths of curves as tensor product B-splines 

Nt Ng 

(8) c(t,0) = ^^c,.,R,(t)C,(0) 

i=i j=i 

of degree nt in time it) and ng in space (6*), respectively, using Nt and Ng basis 
functions and controls Cij € The spatial basis functions Cj are defined on 
the uniform knot sequence 0j = , 0 < j < 2n Ng, and satisfy periodic 

boundary conditions. The time basis functions Bi are defined using uniform knots 
on the interior of the interval [0,1] and with full multiplicity on the boundary. Full 
multiplicity of the boundary knots ensures that fixing the initial and final curves 
is equivalent to fixing the control points Cij and Cpf_,j. Regarding smoothness, we 
have Bi e C”‘-i([0,1]) and Cj G C'”'>-i([0,’27r]). 
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The proposed discretization scheme is independent of the special form of the 
metric, and can be applied equally to the scale-invariant versions as well as to the 
family of elastic metrics. 


3.1. Boundary Value Problem for Geodesics. To evaluate the energy ([^ on 
discretized paths ([^, we use Gaussian quadrature on each interval between subse¬ 
quent knots. The resulting discretized energy iiidiscr(ci,i,... ,CNt,Ng) is minimized 
over all discrete paths ([^ with fixed initial and final control points Cij and 
This is a finite-dimensional, nonlinear, unconstrainecQ optimization problem. 

We solved this problem using either Matlab’s fminunc function with gradients 
computed by finite differences, or using AMPL [To] and the Ipopt solver 25 


with gradients computed by automatic differentiation. By classical approximation 
results 22 , the discrete energy of discrete paths converges to the continuous energy 
of smooth paths as the number of control points tends to infinity. Hence we expect 
to obtain close approximations of geodesics. Establishing rigorous convergence 
results for our discretization will be the subject of future work. 


3.2. Computing the Karcher Mean. The Karcher mean c of a set {ci,..., c„} 
of curves is the minimizer of 

1 ” 

(9) F(c) =-^dist(c,cj)^ . 


It can be calculated by a gradient descent on (Imm(S'^, G). Letting Log^Cj 


denote the Riemannian logarithm, the gradient of F with respect to G is 17 


( 10 ) 


grad*^ F{c) = - V Log, 

n ^^ 


c W 


i=i 


3.3. Initial Value Problem for Geodesics. To calculate the Karcher mean by 
gradient descent, one has to repeatedly solve the geodesic equation ([^. To this 
aim, we use the time-discrete variational geodesic calculus 21 . Given three curves 
Co, Cl) C 2 , one defines the discrete energy 


( 11 ) £'2(co. Cl, C2) = Geo (ci - Co, Cl - Co) -I- Gci(c 2 - Cl, C2 - Cl) • 

A 3-tuple (co,Ci,C 2 ) is a discrete geodesic if it is a minimizer of the discrete en¬ 
ergy E 2 with fixed endpoints co,C 2 . The discrete exponential map is defined 
as follows: C 2 = Exp^^ci, if (co,ci,C 2 ) is a discrete geodesic, in other words, if 
Cl = argminE 2 (co, •, C2). To find C2, we differentiate the discrete energy E 2 with 
respect to ci and solve the resulting system of nonlinear equations. 


(12) 2Gco(ci - Co, •) - 2 Gci(C 2 - Cl, •) + T>ciG.(c 2 - Ci,C2 - Cl) = 0 . 


We use the solver f solve in Matlab to solve this system of equations. This procedure 
is repeated depending on the desired time-resolution. 

While the convergence results in 


21 do not apply to our setting, we found good 


experimental agreement with the solutions of the boundary value problem. 


^We neglect the condition cg {t, 6) ^ 0. All smooth paths violating this condition have infinite 
energy because of the completeness of the metric. By approximation, spline paths violating this 
condition have very high energy and will, in practice, be avoided during optimization. 
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Figure 3. Eight boundaries of HeLa cell nuclei, the Karcher mean 
of all nuclei (enlarged), and six randomly sampled cells using a 
Gaussian distribution in normal coordinates with the same covari¬ 
ance as the data. 


4. Applications 


4.1. Hela Cells. In our first example we want to characterize nuclear shape vari¬ 
ation in HeLa cells. We use fluorescence microscope images of HeLa cell nuclej^ 
(87 images in total). The acquisition of the cells is described in [^. A similar 
study on this dataset was performed using the LDDMM framework in [19[|20| ; 
applying intrinsically defined Sobolev metrics to the same problem will provide a 
complimentary point of view. 

’ ' to 


To extract the boundary of the nucleus, we apply a thresholding method 16 


obtain a binary image and then fit - using least squares - a spline with Ng = 12 
and ng = 4 to the longest 4-connected component of the thresholded image. Then 
we reparametrize the boundary to approximately constant speed. The remaining 
degree of freedom is the starting point of the parametrization; when computing 
minimizing geodesics, we minimize over this parameter as well. The shapes of cell 
nuclei are thus represented by curves modulo translations, rotations, and constant 
shifts of the parametrisation, i.e. the two curves c and e*^c(- — a) -I- A are considered 
equivalent. Examples of the extracted curves are depicted in Fig. 

The parameters oq, ai, and 02 of the Riemannian metric are chosen as follows: 
we compute the average L^-, H^- and iJ^-contributions £^ 2 , Eh^, Eh^ to the 
energy of linear paths between each pair of curves in the dataset. As the L^- and 
-contributions scale differently, we rescale all curves such that E ]^2 = Efj 2 . Then 
we choose constants uq, ai, and 02 such that 

clqEi ,2 : aiEfji : a 2 Ej ^2 =3:1:6 and E = aoE /,2 -|- aiE^-^ a, 2 E[j 2 = 100 , 

resulting in an average length = 12.45 and ag = 3.36, oi = 2.20, and 02 = 6.73. 

The average shape of the nucleus can be captured by the Karcher mean c. To 
compute the Karcher mean of the 87 nuclei, we use a conjugate gradient method on 
the Riemannian manifold of curves, as implemented in the Manopt library |^, to 
solve the minimization problem (^. We obtain convergence in 28 steps; the final 
value of the objective function (lO) is E(c) = 10.55 and the norm of the gradient is 
II grad^ ^"(c)||c < 10“^. The mean shape can be seen in Fig. 

Having computed the mean c, we represent each nuclear shape cj by the initial 
velocity vj = Log^cj of the minimal geodesic connecting c and Cj. We perform 
principal component analysis with respect to the inner product Gc on the set of 
initial velocities {vj : j = 1,..., 87}. The first four eigenvalues are 4.10, 2.39,1.68 
and 1.00, and they explain 38.04%, 60.21%, 75.78%, and 85.07% of the total variance. 


“^The dataset was downloaded from http://inurphylab.web.cinu.edu/data 
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Figure 4. Geodesics from the mean in the first four principal 
directions. The curves show the geodesic at times —3, —2,..., 2, 3; 
the bold curve is the mean. One can see four different characteristic 
deformations of the cell: bending, stretching along the long axis, 
stretching along the short axis, and a combination of stretching 
with partial bending. 



Figure 5. All 87 cell nuclei projected to the plane in the tangent 
space which is spanned by the first two principal directions. The 
mean (in blue) is situated at the origin. The units on the coordinate 
axes are standard deviations. We see that the first coordinate is 
related to the bending of the nucleus, while the second coordinate 
is related to its elongation. 


Geodesics from the mean in the directions of the first four principal directions can 
be seen in Fig. S Fig.0 shows the data projected to the subspace spanned by the 
first two eigenvectors. 

Finally, we sample from a normal distribution with the same covariance matrix 
as the data and project the sampled velocities v back to the space of curves using 
the exponential map c = Expg-D; some examples can be seen in Fig. 

4.2. Traces of Cardiac Images. In our second application we study curves that 
are obtained from images of the cardiac cycle. More precisely, we consider a sequence 
of 30 cardiac images, taken at equispaced time points along the cardiac cycle. Each 
image is projected to a barycentric subspace of dimension two, yielding a closed 
curve in the two-dimensional space of barycentric coordinates. After normalizing the 
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Figure 6. Projections to a two-dimensional barycentric subspace 
of 30 images from the cardiac cycles of three patients. Cubic splines 
interpolation of degree ng = 3 with Ng = 30 control points is used. 


coordinates 18 Sect. 3] we obtain a closed, plane curve - with the curve parameter 
representing time - to which we can apply the methods presented in Sect.[^ Details 
regarding the acquisition and projection of the images can be found in |12l[24 
barycentric subspaces on manifolds are described in 


18 


The data consists of 10 cardiac cycles of patients with Tetralogy of Fallot and 
9 patients from a control group. Each cardiac cycle is originally represented by 
three-dimensional homogeneous coordinates xi : X 2 ■ X 3 , sampled at 30 time points. 
We project the homogeneous coordinates onto the plane xi + X 2 + X 3 = 1 and choose 
a two-dimensional coordinate system for this plane. Then we use spline interpolation 
with degree ng = 3 and Ng = 30 control points to reconstruct the planar curves 
from the data points; see Fig. 

The parameters oq, oi, and 02 in the metric are chosen similarly to Sect. 4.1 


however, the scale of the curves is not changed and we use equal weighting between 
the L^-, and iJ^-parts of the average energy for linear paths. This leads to 
parameters oq = 1, oi = 0.1, and 02 = 10“®. To see if the metric structure derived 
from the Sobolev metric enables us to distinguish between diseased patients and the 
control group, we compute all 171 pairwise distances between the 19 curves; this 
takes about 15 minutes on a 2 GHz single core processor. Multi-dimensional scaling 
of the distance matrix shows that the metric separates healthy and diseased patients 
quite well (Fig. Indeed, a cluster analysis based on the distance matrix recovers 
exactly - with exception of one outlier (patient 4) - the subgroups of healthy and 
diseased patients (Fig. Iz^)- 

The Karcher means of the healthy and diseased subgroups as well as of the 
entire population are depicted in Fig. [8| T he mean was computed using a gradient 
descent method as described in Sect. 13.2] with a threshold of 10““^ for the norm of 
the gradient. The average distance from the mean for the diseased group is 0.6853 
with a variance of 0.0149, and for the control group the distance is 0.7708 with a 
variance of 0.0083. 

To investigate the variability of the observed data, we performe principal compo¬ 
nent analysis on the initial velocities of the minimal geodesics connecting curves to 
the respective means (c.f. Sect. 4.1). Fig. shows the initial velocities projected to 
the subspace spanned by the first two principal directions. Within the healthy and 
sick subgroups, less then 30% of the principal components are needed to explain 
90% of the shape variation. If, in contrast, principal components are analyzed for 
the entire dataset based on the global Karcher mean, then 35% of the principal 
components are needed to explain 90% of the shape variation. 
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(A) (B) 


Figure 7. (a) Two dimensional representation of the data using 
multi dimensional scaling of the pairwise distance matrix, (b) A 
dendrogram of clusters computed from the pairwise distance matrix 
using the single linkage criterion. Healthy patients are labelled 1-9 
and diseased ones 10-19. 



Figure 8. First row: Karcher means of pathological cardiac cycles 
(left), all cycles (middle), and healthy cycles (right). Second row: 
geodesic connecting the Karcher mean of pathological cycles to the 
Karcher mean of healthy cycles. The crosses denote the position 
of images, with respect to whom the barycentric projection was 
computed. 


5. Conclusions 

In this article we numerically solved the initial and boundary value problems for 
geodesics on the space of regular curves under second order Sobolev metrics. We 
analyzed two medical datasets using our approach. In future work we plan to prove 
rigorous convergence results for our discretizations, further investigate the impact 
of the constants in the metric, and treat unparametrized curves. 

Acknowledgments. We would like to thank Xavier Pennec and Marc-Michel Rohe 
for providing us the cardiac image data and Hermann Schichl for his invaluable help 
with AMPL. 
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